Language identification on code-switching utterances using multiple cues
نویسندگان
چکیده
Code-switching speech is an utterance containing two or more languages. Usually, the switching linguistic unit is in clause or word levels. In this paper, a two-stage framework is proposed, containing a language identifier and then a speech recognizer, to evaluate on a Mandarin-Taiwanese codeswitching utterance. In the language identifier, we use multiple cues including acoustic, prosodic and phonetic features. In order to integrate the cues to distinguish one language from another, we used a maximum a posteriori decision rule to connect an acoustic model, a duration model and a language model. In the experiments, we have achieved 34.5% (LID) and 17.7% (ASR) error rate reduction comparing with one stage LVCSR-based system.
منابع مشابه
Modeling code-Switching speech on under-resourced languages for language identification
This paper presents an integration of phonotactic information to perform language identification (LID) in a mixed-language speech. A single-pass front-end recognition system is employed to convert the spoken utterances into a statistical occurrence of phone sequences. To process such phone sequences, a hidden Markov model (HMM) is utilized to build robust acoustic models that can handle multipl...
متن کاملDeveloping Language-tagged Corpora for Code-switching Tweets
Code-switching, where a speaker switches between languages mid-utterance, is frequently used by multilingual populations worldwide. Despite its prevalence, limited effort has been devoted to develop computational approaches or even basic linguistic resources to support research into the processing of such mixedlanguage data. We present a user-centric approach to collecting code-switched utteran...
متن کاملThe effect of Code switching on the Acquisition of Object Relative Clauses by Iranian EFL Learners
This study attempted to investigate the impact of teacher’s code-switching on the acquisition of a problematic grammatical structure, namely, object relative clauses, by intermediate EFL learners. Moreover, a secondary objective of the study was to determine the EFL learners’ attitudes and opinions regarding the effectiveness of teacher’s code-switching in their learning of a specific aspect of...
متن کاملA Language Modeling Approach to Identifying Code-Switched Sentences and Words
Globalization and multilingualism contribute to code-switching – the phenomenon in which speakers produce utterances containing words or expressions from a second language. Processing code-switched sentences is a significant challenge for multilingual intelligent systems. This study proposes a language modeling approach to the problem of codeswitching language processing, dividing the problem i...
متن کاملMotivational Determinants of Code-Switching in Iranian EFL Classrooms
“Code-Switching”, an important issue in the field of both language classroom and sociolinguistics, has been under consideration in investigations related to bilingual and multilingual societies. First proposed by Haugen (1956) and later developed byGrosjean (1982), the termcode-switching refers to language alternation during communication. Although code-switching is unavoidable in bilingual and...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008